Overview
Dataset statistics
| Number of variables | 21 |
|---|---|
| Number of observations | 2000 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 601.1 KiB |
| Average record size in memory | 307.8 B |
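The average record size can be cross-checked against the other two figures in this table. The sketch below assumes that KiB means 1024 bytes and that the average is simply total memory divided by the number of observations; the variable names are illustrative.

```python
# Figures copied from the "Dataset statistics" table.
total_size_kib = 601.1
n_observations = 2000

# Assumption: 1 KiB = 1024 bytes, average = total bytes / row count.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"{avg_record_bytes:.1f} B")  # 307.8 B
```

Under these assumptions the result agrees with the reported 307.8 B.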
Variable types
| Text | 1 |
|---|---|
| Categorical | 6 |
| Numeric | 14 |
Alerts
| Alert | Type |
|---|---|
| Air_Pollution is highly correlated overall with Overall_Risk_Score | High correlation |
| Cancer_Type is highly correlated overall with Gender | High correlation |
| Gender is highly correlated overall with Cancer_Type | High correlation |
| Overall_Risk_Score is highly correlated overall with Air_Pollution and 1 other field | High correlation |
| Risk_Level is highly correlated overall with Overall_Risk_Score | High correlation |
| BRCA_Mutation is highly imbalanced (79.3%) | Imbalance |
| Patient_ID has unique values | Unique |
| Overall_Risk_Score has unique values | Unique |
| Smoking has 169 (8.5%) zeros | Zeros |
| Alcohol_Use has 204 (10.2%) zeros | Zeros |
| Obesity has 109 (5.5%) zeros | Zeros |
| Diet_Red_Meat has 153 (7.6%) zeros | Zeros |
| Diet_Salted_Processed has 184 (9.2%) zeros | Zeros |
| Fruit_Veg_Intake has 186 (9.3%) zeros | Zeros |
| Physical_Activity has 242 (12.1%) zeros | Zeros |
| Air_Pollution has 148 (7.4%) zeros | Zeros |
| Occupational_Hazards has 192 (9.6%) zeros | Zeros |
| Calcium_Intake has 355 (17.8%) zeros | Zeros |
| Physical_Activity_Level has 214 (10.7%) zeros | Zeros |
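The 79.3% imbalance figure for BRCA_Mutation is not a class share (the majority class is 1935 of 2000 rows, about 96.8%). A minimal sketch, assuming the score is one minus the normalized Shannon entropy of the class counts, reproduces the reported value. The function name `imbalance_score` is hypothetical, and the formula is an assumption that happens to match this report rather than a confirmed description of the profiler.

```python
import math

def imbalance_score(counts):
    """Assumed definition: 1 - H / log2(k), with H the base-2 Shannon entropy."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Class counts taken from the BRCA_Mutation and Gender "Common Values" tables.
brca = imbalance_score([1935, 65])
gender = imbalance_score([1022, 978])

print(f"BRCA_Mutation: {brca:.1%}")  # 79.3%
print(f"Gender: {gender:.1%}")
```

A score near 0 means the classes are close to evenly split (as for Gender), while a score near 1 means almost all rows fall in one class.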
Reproduction
| Analysis started | 2025-11-29 15:03:00.501285 |
|---|---|
| Analysis finished | 2025-11-29 15:03:26.556925 |
| Duration | 26.06 seconds |
| Software version | ydata-profiling v4.18.0 |
| Download configuration | config.json |
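The reported duration follows from the two timestamps. A short check using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" table.
started = datetime.fromisoformat("2025-11-29 15:03:00.501285")
finished = datetime.fromisoformat("2025-11-29 15:03:26.556925")

duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")  # 26.06 seconds
```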
Variables
Patient_ID
Text
Unique
| Distinct | 2000 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 107.6 KiB |
Length
| Max length | 6 |
|---|---|
| Median length | 6 |
| Mean length | 6 |
| Min length | 6 |
Unique
| Unique | 2000 |
|---|---|
| Unique (%) | 100.0% |
Sample
| 1st row | LU0000 |
|---|---|
| 2nd row | LU0001 |
| 3rd row | LU0002 |
| 4th row | LU0003 |
| 5th row | LU0004 |
| Value | Count | Frequency (%) |
| lu0000 | 1 | < 0.1% |
| lu0015 | 1 | < 0.1% |
| lu0002 | 1 | < 0.1% |
| lu0003 | 1 | < 0.1% |
| lu0004 | 1 | < 0.1% |
| lu0005 | 1 | < 0.1% |
| lu0006 | 1 | < 0.1% |
| lu0007 | 1 | < 0.1% |
| lu0008 | 1 | < 0.1% |
| lu0009 | 1 | < 0.1% |
| Other values (1990) | 1990 |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 2900 | |
| 3 | 900 | 7.5% |
| 1 | 900 | 7.5% |
| 2 | 900 | 7.5% |
| R | 800 | 6.7% |
| 6 | 400 | 3.3% |
| T | 400 | 3.3% |
| S | 400 | 3.3% |
| O | 400 | 3.3% |
| C | 400 | 3.3% |
| Other values (9) | 3600 |
Cancer_Type
Categorical
High correlation
| Distinct | 5 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 106.1 KiB |
Length
| Max length | 8 |
|---|---|
| Median length | 6 |
| Mean length | 5.279 |
| Min length | 4 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Breast |
|---|---|
| 2nd row | Prostate |
| 3rd row | Skin |
| 4th row | Colon |
| 5th row | Lung |
Common Values
| Value | Count | Frequency (%) |
| Lung | 527 | |
| Breast | 460 | |
| Colon | 418 | |
| Prostate | 305 | |
| Skin | 290 |
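The mean length of 5.279 reported for Cancer_Type can be recovered from the value counts alone: it is the count-weighted average of the label lengths. A small sketch, with illustrative variable names:

```python
# Value counts copied from the Cancer_Type "Common Values" table.
counts = {"Lung": 527, "Breast": 460, "Colon": 418, "Prostate": 305, "Skin": 290}

n_rows = sum(counts.values())
total_chars = sum(len(label) * n for label, n in counts.items())
mean_length = total_chars / n_rows

print(n_rows, total_chars, mean_length)  # 2000 10558 5.279
```

The total of 10558 characters also matches the character total shown in the character-level tables for this variable.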
Most occurring characters
| Value | Count | Frequency (%) |
| n | 1235 | |
| o | 1141 | |
| t | 1070 | |
| s | 765 | 7.2% |
| r | 765 | 7.2% |
| e | 765 | 7.2% |
| a | 765 | 7.2% |
| u | 527 | 5.0% |
| L | 527 | 5.0% |
| g | 527 | 5.0% |
| Other values (7) | 2471 |
Age
Real number (ℝ)
| Distinct | 61 |
|---|---|
| Distinct (%) | 3.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 63.248 |
| Minimum | 25 |
|---|---|
| Maximum | 90 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 15.8 KiB |
Quantile statistics
| Minimum | 25 |
|---|---|
| 5-th percentile | 45 |
| Q1 | 56 |
| median | 64 |
| Q3 | 70 |
| 95-th percentile | 80 |
| Maximum | 90 |
| Range | 65 |
| Interquartile range (IQR) | 14 |
Descriptive statistics
| Standard deviation | 10.462946 |
|---|---|
| Coefficient of variation (CV) | 0.1654273 |
| Kurtosis | -0.014791169 |
| Mean | 63.248 |
| Median Absolute Deviation (MAD) | 7 |
| Skewness | -0.18140206 |
| Sum | 126496 |
| Variance | 109.47323 |
| Monotonicity | Not monotonic |
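Several of the Age statistics are derived from one another, which gives a quick internal consistency check. The sketch below recomputes them from the reported mean, standard deviation, and quantiles; it assumes the usual definitions (CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 - Q1).

```python
# Values copied from the Age tables.
mean, std = 63.248, 10.462946
minimum, maximum = 25, 90
q1, q3 = 56, 70
n = 2000

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance from the standard deviation
value_range = maximum - minimum  # range
iqr = q3 - q1                    # interquartile range
total = mean * n                 # sum of all ages

print(round(cv, 7), round(variance, 3), value_range, iqr, round(total))
```

Each recomputed value agrees with the table to the precision shown there.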
| Value | Count | Frequency (%) |
| 64 | 85 | 4.2% |
| 65 | 85 | 4.2% |
| 68 | 81 | 4.0% |
| 60 | 81 | 4.0% |
| 67 | 77 | 3.9% |
| 61 | 75 | 3.8% |
| 62 | 73 | 3.6% |
| 63 | 73 | 3.6% |
| 66 | 72 | 3.6% |
| 69 | 71 | 3.5% |
| Other values (51) | 1227 |
| Value | Count | Frequency (%) |
| 25 | 2 | 0.1% |
| 29 | 1 | 0.1% |
| 31 | 1 | 0.1% |
| 32 | 2 | 0.1% |
| 34 | 3 | 0.1% |
| 35 | 1 | 0.1% |
| 36 | 1 | 0.1% |
| 37 | 10 | |
| 38 | 5 | |
| 39 | 5 |
| Value | Count | Frequency (%) |
| 90 | 6 | 0.3% |
| 89 | 5 | 0.2% |
| 88 | 3 | 0.1% |
| 87 | 7 | 0.4% |
| 86 | 7 | 0.4% |
| 85 | 9 | |
| 84 | 5 | 0.2% |
| 83 | 11 | |
| 82 | 18 | |
| 81 | 20 |
Gender
Categorical
High correlation
| Distinct | 2 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 97.8 KiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 1 |
| 3rd row | 1 |
| 4th row | 0 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 1022 | |
| 1 | 978 |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 1022 | |
| 1 | 978 |
Smoking
Real number (ℝ)
Zeros
| Distinct | 11 |
|---|---|
| Distinct (%) | 0.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.157 |
| Minimum | 0 |
|---|---|
| Maximum | 10 |
| Zeros | 169 |
| Zeros (%) | 8.5% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 15.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 2 |
| median | 5 |
| Q3 | 8 |
| 95-th percentile | 10 |
| Maximum | 10 |
| Range | 10 |
| Interquartile range (IQR) | 6 |
Descriptive statistics
| Standard deviation | 3.3253391 |
|---|---|
| Coefficient of variation (CV) | 0.64482045 |
| Kurtosis | -1.2551944 |
| Mean | 5.157 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | 0.055220323 |
| Sum | 10314 |
| Variance | 11.05788 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 10 | 343 | |
| 6 | 267 | |
| 1 | 204 | |
| 3 | 175 | |
| 2 | 174 | |
| 0 | 169 | |
| 5 | 166 | |
| 4 | 166 | |
| 9 | 126 | 6.3% |
| 8 | 107 | 5.3% |
| Value | Count | Frequency (%) |
| 0 | 169 | |
| 1 | 204 | |
| 2 | 174 | |
| 3 | 175 | |
| 4 | 166 | |
| 5 | 166 | |
| 6 | 267 | |
| 7 | 103 | 5.1% |
| 8 | 107 | |
| 9 | 126 |
| Value | Count | Frequency (%) |
| 10 | 343 | |
| 9 | 126 | 6.3% |
| 8 | 107 | 5.3% |
| 7 | 103 | 5.1% |
| 6 | 267 | |
| 5 | 166 | |
| 4 | 166 | |
| 3 | 175 | |
| 2 | 174 | |
| 1 | 204 |
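Because Smoking takes only the 11 integer values 0 to 10, its summary statistics can be rebuilt from the frequency tables. The sketch below combines the counts listed in the tables and recomputes the sum, mean, share of zeros, and variance. Dividing by n - 1 (sample variance) is an assumption; it is the choice that matches the reported 11.05788.

```python
# Counts for each Smoking value, combined from the frequency tables.
smoking_counts = {0: 169, 1: 204, 2: 174, 3: 175, 4: 166, 5: 166,
                  6: 267, 7: 103, 8: 107, 9: 126, 10: 343}

n = sum(smoking_counts.values())
total = sum(value * count for value, count in smoking_counts.items())
mean = total / n

# Assumption: sample variance with an n - 1 denominator.
squared_dev = sum(count * (value - mean) ** 2
                  for value, count in smoking_counts.items())
sample_variance = squared_dev / (n - 1)

zeros_share = smoking_counts[0] / n

print(n, total, mean)             # 2000 10314 5.157
print(round(sample_variance, 4))  # 11.0579
print(f"{zeros_share:.2%}")       # 8.45%
```

The zero share of 8.45% is what the report displays as 8.5% after rounding to one decimal place.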
Alcohol_Use
Real number (ℝ)
Zeros
| Distinct | 11 |
|---|---|
| Distinct (%) | 0.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.035 |
| Minimum | 0 |
|---|---|
| Maximum | 10 |
| Zeros | 204 |
| Zeros (%) | 10.2% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 15.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 2 |
| median | 5 |
| Q3 | 8 |
| 95-th percentile | 10 |
| Maximum | 10 |
| Range | 10 |
| Interquartile range (IQR) | 6 |
Descriptive statistics
| Standard deviation | 3.2609956 |
|---|---|
| Coefficient of variation (CV) | 0.64766545 |
| Kurtosis | -1.3211465 |
| Mean | 5.035 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | -0.05730029 |
| Sum | 10070 |
| Variance | 10.634092 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 7 | 220 | |
| 2 | 210 | |
| 0 | 204 | |
| 9 | 197 | |
| 8 | 195 | |
| 1 | 186 | |
| 10 | 183 | |
| 6 | 183 | |
| 3 | 151 | |
| 4 | 145 |
| Value | Count | Frequency (%) |
| 0 | 204 | |
| 1 | 186 | |
| 2 | 210 | |
| 3 | 151 | |
| 4 | 145 | |
| 5 | 126 | |
| 6 | 183 | |
| 7 | 220 | |
| 8 | 195 | |
| 9 | 197 |
| Value | Count | Frequency (%) |
| 10 | 183 | |
| 9 | 197 | |
| 8 | 195 | |
| 7 | 220 | |
| 6 | 183 | |
| 5 | 126 | |
| 4 | 145 | |
| 3 | 151 | |
| 2 | 210 | |
| 1 | 186 |
Obesity
Real number (ℝ)
Zeros
| Distinct | 11 |
|---|---|
| Distinct (%) | 0.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.9675 |
| Minimum | 0 |
|---|---|
| Maximum | 10 |
| Zeros | 109 |
| Zeros (%) | 5.5% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 15.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 4 |
| median | 6 |
| Q3 | 9 |
| 95-th percentile | 10 |
| Maximum | 10 |
| Range | 10 |
| Interquartile range (IQR) | 5 |
Descriptive statistics
| Standard deviation | 3.0613934 |
|---|---|
| Coefficient of variation (CV) | 0.51301105 |
| Kurtosis | -0.96720309 |
| Mean | 5.9675 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | -0.32497539 |
| Sum | 11935 |
| Variance | 9.3721298 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 10 | 348 | |
| 6 | 217 | |
| 5 | 210 | |
| 8 | 208 | |
| 4 | 208 | |
| 7 | 190 | |
| 9 | 181 | |
| 2 | 121 | 6.0% |
| 0 | 109 | 5.5% |
| 1 | 109 | 5.5% |
| Value | Count | Frequency (%) |
| 0 | 109 | |
| 1 | 109 | |
| 2 | 121 | |
| 3 | 99 | |
| 4 | 208 | |
| 5 | 210 | |
| 6 | 217 | |
| 7 | 190 | |
| 8 | 208 | |
| 9 | 181 |
| Value | Count | Frequency (%) |
| 10 | 348 | |
| 9 | 181 | |
| 8 | 208 | |
| 7 | 190 | |
| 6 | 217 | |
| 5 | 210 | |
| 4 | 208 | |
| 3 | 99 | 5.0% |
| 2 | 121 | 6.0% |
| 1 | 109 | 5.5% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 1611 | |
| 1 | 389 | 19.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 1611 | |
| 1 | 389 | 19.4% |
Diet_Red_Meat
Real number (ℝ)
Zeros
| Distinct | 11 |
|---|---|
| Distinct (%) | 0.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.1895 |
| Minimum | 0 |
|---|---|
| Maximum | 10 |
| Zeros | 153 |
| Zeros (%) | 7.6% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 15.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 3 |
| median | 5 |
| Q3 | 8 |
| 95-th percentile | 10 |
| Maximum | 10 |
| Range | 10 |
| Interquartile range (IQR) | 5 |
Descriptive statistics
| Standard deviation | 3.1544516 |
|---|---|
| Coefficient of variation (CV) | 0.60785271 |
| Kurtosis | -1.1571786 |
| Mean | 5.1895 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | -0.0079247227 |
| Sum | 10379 |
| Variance | 9.950565 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 10 | 261 | |
| 5 | 206 | |
| 7 | 201 | |
| 6 | 189 | |
| 4 | 188 | |
| 3 | 180 | |
| 2 | 170 | |
| 1 | 169 | |
| 0 | 153 | |
| 8 | 150 |
| Value | Count | Frequency (%) |
| 0 | 153 | |
| 1 | 169 | |
| 2 | 170 | |
| 3 | 180 | |
| 4 | 188 | |
| 5 | 206 | |
| 6 | 189 | |
| 7 | 201 | |
| 8 | 150 | |
| 9 | 133 |
| Value | Count | Frequency (%) |
| 10 | 261 | |
| 9 | 133 | |
| 8 | 150 | |
| 7 | 201 | |
| 6 | 189 | |
| 5 | 206 | |
| 4 | 188 | |
| 3 | 180 | |
| 2 | 170 | |
| 1 | 169 |
Diet_Salted_Processed
Real number (ℝ)
Zeros
| Distinct | 11 |
|---|---|
| Distinct (%) | 0.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.5635 |
| Minimum | 0 |
|---|---|
| Maximum | 10 |
| Zeros | 184 |
| Zeros (%) | 9.2% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 15.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 2 |
| median | 4 |
| Q3 | 7 |
| 95-th percentile | 10 |
| Maximum | 10 |
| Range | 10 |
| Interquartile range (IQR) | 5 |
Descriptive statistics
| Standard deviation | 3.0883226 |
|---|---|
| Coefficient of variation (CV) | 0.6767443 |
| Kurtosis | -1.0428935 |
| Mean | 4.5635 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | 0.30094034 |
| Sum | 9127 |
| Variance | 9.5377366 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 4 | 260 | |
| 3 | 253 | |
| 2 | 229 | |
| 1 | 187 | |
| 0 | 184 | |
| 10 | 182 | |
| 6 | 167 | |
| 5 | 151 | |
| 9 | 136 | |
| 7 | 126 |
| Value | Count | Frequency (%) |
| 0 | 184 | |
| 1 | 187 | |
| 2 | 229 | |
| 3 | 253 | |
| 4 | 260 | |
| 5 | 151 | |
| 6 | 167 | |
| 7 | 126 | |
| 8 | 125 | |
| 9 | 136 |
| Value | Count | Frequency (%) |
| 10 | 182 | |
| 9 | 136 | |
| 8 | 125 | |
| 7 | 126 | |
| 6 | 167 | |
| 5 | 151 | |
| 4 | 260 | |
| 3 | 253 | |
| 2 | 229 | |
| 1 | 187 |
Fruit_Veg_Intake
Real number (ℝ)
Zeros
| Distinct | 11 |
|---|---|
| Distinct (%) | 0.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.9275 |
| Minimum | 0 |
|---|---|
| Maximum | 10 |
| Zeros | 186 |
| Zeros (%) | 9.3% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 15.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 3 |
| median | 5 |
| Q3 | 8 |
| 95-th percentile | 10 |
| Maximum | 10 |
| Range | 10 |
| Interquartile range (IQR) | 5 |
Descriptive statistics
| Standard deviation | 3.0453047 |
|---|---|
| Coefficient of variation (CV) | 0.61802226 |
| Kurtosis | -1.0837447 |
| Mean | 4.9275 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | 0.018508932 |
| Sum | 9855 |
| Variance | 9.2738807 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 4 | 214 | |
| 6 | 214 | |
| 3 | 214 | |
| 5 | 210 | |
| 8 | 195 | |
| 0 | 186 | |
| 10 | 162 | |
| 1 | 159 | |
| 7 | 154 | |
| 2 | 146 |
| Value | Count | Frequency (%) |
| 0 | 186 | |
| 1 | 159 | |
| 2 | 146 | |
| 3 | 214 | |
| 4 | 214 | |
| 5 | 210 | |
| 6 | 214 | |
| 7 | 154 | |
| 8 | 195 | |
| 9 | 146 |
| Value | Count | Frequency (%) |
| 10 | 162 | |
| 9 | 146 | |
| 8 | 195 | |
| 7 | 154 | |
| 6 | 214 | |
| 5 | 210 | |
| 4 | 214 | |
| 3 | 214 | |
| 2 | 146 | |
| 1 | 159 |
Physical_Activity
Real number (ℝ)
Zeros
| Distinct | 11 |
|---|---|
| Distinct (%) | 0.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.015 |
| Minimum | 0 |
|---|---|
| Maximum | 10 |
| Zeros | 242 |
| Zeros (%) | 12.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 15.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1 |
| median | 4 |
| Q3 | 6 |
| 95-th percentile | 10 |
| Maximum | 10 |
| Range | 10 |
| Interquartile range (IQR) | 5 |
Descriptive statistics
| Standard deviation | 2.9784578 |
|---|---|
| Coefficient of variation (CV) | 0.74183257 |
| Kurtosis | -0.84281018 |
| Mean | 4.015 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | 0.45586305 |
| Sum | 8030 |
| Variance | 8.8712106 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 261 | |
| 3 | 254 | |
| 4 | 246 | |
| 0 | 242 | |
| 2 | 225 | |
| 5 | 173 | |
| 6 | 152 | |
| 8 | 117 | |
| 10 | 112 | |
| 7 | 111 |
| Value | Count | Frequency (%) |
| 0 | 242 | |
| 1 | 261 | |
| 2 | 225 | |
| 3 | 254 | |
| 4 | 246 | |
| 5 | 173 | |
| 6 | 152 | |
| 7 | 111 | |
| 8 | 117 | |
| 9 | 107 |
| Value | Count | Frequency (%) |
| 10 | 112 | |
| 9 | 107 | |
| 8 | 117 | |
| 7 | 111 | |
| 6 | 152 | |
| 5 | 173 | |
| 4 | 246 | |
| 3 | 254 | |
| 2 | 225 | |
| 1 | 261 |
Air_Pollution
Real number (ℝ)
High correlation Zeros
| Distinct | 11 |
|---|---|
| Distinct (%) | 0.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.323 |
| Minimum | 0 |
|---|---|
| Maximum | 10 |
| Zeros | 148 |
| Zeros (%) | 7.4% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 15.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 3 |
| median | 5 |
| Q3 | 8 |
| 95-th percentile | 10 |
| Maximum | 10 |
| Range | 10 |
| Interquartile range (IQR) | 5 |
Descriptive statistics
| Standard deviation | 3.2074624 |
|---|---|
| Coefficient of variation (CV) | 0.60256667 |
| Kurtosis | -1.2113638 |
| Mean | 5.323 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | 0.0032898067 |
| Sum | 10646 |
| Variance | 10.287815 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 10 | 310 | |
| 4 | 243 | |
| 3 | 211 | |
| 8 | 172 | |
| 5 | 169 | |
| 6 | 161 | |
| 2 | 160 | |
| 0 | 148 | |
| 7 | 144 | |
| 9 | 143 |
| Value | Count | Frequency (%) |
| 0 | 148 | |
| 1 | 139 | |
| 2 | 160 | |
| 3 | 211 | |
| 4 | 243 | |
| 5 | 169 | |
| 6 | 161 | |
| 7 | 144 | |
| 8 | 172 | |
| 9 | 143 |
| Value | Count | Frequency (%) |
| 10 | 310 | |
| 9 | 143 | |
| 8 | 172 | |
| 7 | 144 | |
| 6 | 161 | |
| 5 | 169 | |
| 4 | 243 | |
| 3 | 211 | |
| 2 | 160 | |
| 1 | 139 |
Occupational_Hazards
Real number (ℝ)
Zeros
| Distinct | 11 |
|---|---|
| Distinct (%) | 0.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.979 |
| Minimum | 0 |
|---|---|
| Maximum | 10 |
| Zeros | 192 |
| Zeros (%) | 9.6% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 15.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 2 |
| median | 5 |
| Q3 | 8 |
| 95-th percentile | 10 |
| Maximum | 10 |
| Range | 10 |
| Interquartile range (IQR) | 6 |
Descriptive statistics
| Standard deviation | 3.2128991 |
|---|---|
| Coefficient of variation (CV) | 0.64529003 |
| Kurtosis | -1.1775541 |
| Mean | 4.979 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | 0.074888453 |
| Sum | 9958 |
| Variance | 10.32272 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 5 | 243 | |
| 10 | 238 | |
| 4 | 212 | |
| 0 | 192 | |
| 1 | 182 | |
| 3 | 173 | |
| 9 | 168 | |
| 2 | 164 | |
| 6 | 152 | |
| 7 | 146 |
| Value | Count | Frequency (%) |
| 0 | 192 | |
| 1 | 182 | |
| 2 | 164 | |
| 3 | 173 | |
| 4 | 212 | |
| 5 | 243 | |
| 6 | 152 | |
| 7 | 146 | |
| 8 | 130 | |
| 9 | 168 |
| Value | Count | Frequency (%) |
| 10 | 238 | |
| 9 | 168 | |
| 8 | 130 | |
| 7 | 146 | |
| 6 | 152 | |
| 5 | 243 | |
| 4 | 212 | |
| 3 | 173 | |
| 2 | 164 | |
| 1 | 182 |
BRCA_Mutation
Categorical
Imbalance
| Distinct | 2 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 97.8 KiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 1935 | |
| 1 | 65 | 3.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 1935 | |
| 1 | 65 | 3.2% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 1607 | |
| 1 | 393 | 19.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 1607 | |
| 1 | 393 | 19.7% |
Calcium_Intake
Real number (ℝ)
Zeros
| Distinct | 11 |
|---|---|
| Distinct (%) | 0.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.9405 |
| Minimum | 0 |
|---|---|
| Maximum | 10 |
| Zeros | 355 |
| Zeros (%) | 17.8% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 15.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1 |
| median | 4 |
| Q3 | 6 |
| 95-th percentile | 10 |
| Maximum | 10 |
| Range | 10 |
| Interquartile range (IQR) | 5 |
Descriptive statistics
| Standard deviation | 3.0488699 |
|---|---|
| Coefficient of variation (CV) | 0.77372665 |
| Kurtosis | -0.95611768 |
| Mean | 3.9405 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | 0.34952778 |
| Sum | 7881 |
| Variance | 9.2956076 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 355 | |
| 1 | 210 | |
| 5 | 205 | |
| 4 | 202 | |
| 2 | 201 | |
| 3 | 192 | |
| 6 | 181 | |
| 7 | 166 | |
| 10 | 110 | 5.5% |
| 8 | 90 | 4.5% |
| Value | Count | Frequency (%) |
| 0 | 355 | |
| 1 | 210 | |
| 2 | 201 | |
| 3 | 192 | |
| 4 | 202 | |
| 5 | 205 | |
| 6 | 181 | |
| 7 | 166 | |
| 8 | 90 | 4.5% |
| 9 | 88 | 4.4% |
| Value | Count | Frequency (%) |
| 10 | 110 | |
| 9 | 88 | |
| 8 | 90 | |
| 7 | 166 | |
| 6 | 181 | |
| 5 | 205 | |
| 4 | 202 | |
| 3 | 192 | |
| 2 | 201 | |
| 1 | 210 |
Overall_Risk_Score
Real number (ℝ)
High correlation Unique
| Distinct | 2000 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.45444884 |
| Minimum | 0.029284507 |
|---|---|
| Maximum | 0.85215847 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 15.8 KiB |
Quantile statistics
| Minimum | 0.029284507 |
|---|---|
| 5-th percentile | 0.25396347 |
| Q1 | 0.36698156 |
| median | 0.45539924 |
| Q3 | 0.53978165 |
| 95-th percentile | 0.66163792 |
| Maximum | 0.85215847 |
| Range | 0.82287396 |
| Interquartile range (IQR) | 0.17280009 |
Descriptive statistics
| Standard deviation | 0.12307394 |
|---|---|
| Coefficient of variation (CV) | 0.27082023 |
| Kurtosis | -0.29088733 |
| Mean | 0.45444884 |
| Median Absolute Deviation (MAD) | 0.086175833 |
| Skewness | 0.016484704 |
| Sum | 908.89768 |
| Variance | 0.015147194 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0.398696098 | 1 | 0.1% |
| 0.366688926 | 1 | 0.1% |
| 0.326865648 | 1 | 0.1% |
| 0.239170043 | 1 | 0.1% |
| 0.441567082 | 1 | 0.1% |
| 0.34484396 | 1 | 0.1% |
| 0.528886977 | 1 | 0.1% |
| 0.476888763 | 1 | 0.1% |
| 0.497251035 | 1 | 0.1% |
| 0.359792262 | 1 | 0.1% |
| Other values (1990) | 1990 |
| Value | Count | Frequency (%) |
| 0.029284507 | 1 | |
| 0.086929623 | 1 | |
| 0.104906581 | 1 | |
| 0.110532649 | 1 | |
| 0.112548426 | 1 | |
| 0.126186956 | 1 | |
| 0.129169485 | 1 | |
| 0.143059575 | 1 | |
| 0.143226955 | 1 | |
| 0.147960348 | 1 |
| Value | Count | Frequency (%) |
| 0.852158468 | 1 | |
| 0.814066187 | 1 | |
| 0.813508381 | 1 | |
| 0.770790301 | 1 | |
| 0.769561703 | 1 | |
| 0.765889807 | 1 | |
| 0.765827016 | 1 | |
| 0.764008079 | 1 | |
| 0.754868639 | 1 | |
| 0.75302407 | 1 |
BMI
Real number (ℝ)
| Distinct | 208 |
|---|---|
| Distinct (%) | 10.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 26.18335 |
| Minimum | 15 |
|---|---|
| Maximum | 41.4 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 15.8 KiB |
Quantile statistics
| Minimum | 15 |
|---|---|
| 5-th percentile | 19.7 |
| Q1 | 23.5 |
| median | 26.2 |
| Q3 | 28.7 |
| 95-th percentile | 32.705 |
| Maximum | 41.4 |
| Range | 26.4 |
| Interquartile range (IQR) | 5.2 |
Descriptive statistics
| Standard deviation | 3.9474585 |
|---|---|
| Coefficient of variation (CV) | 0.15076217 |
| Kurtosis | 0.012211472 |
| Mean | 26.18335 |
| Median Absolute Deviation (MAD) | 2.6 |
| Skewness | 0.047668228 |
| Sum | 52366.7 |
| Variance | 15.582429 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 25.9 | 34 | 1.7% |
| 26.3 | 26 | 1.3% |
| 25 | 25 | 1.2% |
| 26.8 | 25 | 1.2% |
| 24.1 | 24 | 1.2% |
| 26.7 | 24 | 1.2% |
| 26.1 | 23 | 1.1% |
| 25.1 | 23 | 1.1% |
| 23.4 | 23 | 1.1% |
| 26.5 | 22 | 1.1% |
| Other values (198) | 1751 |
| Value | Count | Frequency (%) |
| 15 | 6 | |
| 15.2 | 2 | 0.1% |
| 15.4 | 1 | 0.1% |
| 15.5 | 1 | 0.1% |
| 15.6 | 1 | 0.1% |
| 15.8 | 1 | 0.1% |
| 15.9 | 1 | 0.1% |
| 16 | 1 | 0.1% |
| 16.1 | 1 | 0.1% |
| 16.3 | 2 | 0.1% |
| Value | Count | Frequency (%) |
| 41.4 | 1 | |
| 38.8 | 1 | |
| 38.6 | 1 | |
| 38.3 | 1 | |
| 36.9 | 1 | |
| 36.6 | 1 | |
| 36.5 | 1 | |
| 36.4 | 2 | |
| 36.3 | 2 | |
| 36.2 | 2 |
Physical_Activity_Level
Real number (ℝ)
Zeros
| Distinct | 11 |
|---|---|
| Distinct (%) | 0.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.9385 |
| Minimum | 0 |
|---|---|
| Maximum | 10 |
| Zeros | 214 |
| Zeros (%) | 10.7% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 15.8 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 2 |
| median | 5 |
| Q3 | 8 |
| 95-th percentile | 10 |
| Maximum | 10 |
| Range | 10 |
| Interquartile range (IQR) | 6 |
Descriptive statistics
| Standard deviation | 3.1660274 |
|---|---|
| Coefficient of variation (CV) | 0.6410909 |
| Kurtosis | -1.2055324 |
| Mean | 4.9385 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | -0.010345111 |
| Sum | 9877 |
| Variance | 10.02373 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 214 | |
| 5 | 204 | |
| 7 | 195 | |
| 9 | 191 | |
| 2 | 188 | |
| 6 | 177 | |
| 3 | 172 | |
| 4 | 171 | |
| 8 | 166 | |
| 10 | 165 |
| Value | Count | Frequency (%) |
| 0 | 214 | |
| 1 | 157 | |
| 2 | 188 | |
| 3 | 172 | |
| 4 | 171 | |
| 5 | 204 | |
| 6 | 177 | |
| 7 | 195 | |
| 8 | 166 | |
| 9 | 191 |
| Value | Count | Frequency (%) |
| 10 | 165 | |
| 9 | 191 | |
| 8 | 166 | |
| 7 | 195 | |
| 6 | 177 | |
| 5 | 204 | |
| 4 | 171 | |
| 3 | 172 | |
| 2 | 188 | |
| 1 | 157 |
Risk_Level
Categorical
High correlation
| Distinct | 3 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 106.4 KiB |
Length
| Max length | 6 |
|---|---|
| Median length | 6 |
| Mean length | 5.412 |
| Min length | 3 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Medium |
|---|---|
| 2nd row | Medium |
| 3rd row | Medium |
| 4th row | Low |
| 5th row | Medium |
Common Values
| Value | Count | Frequency (%) |
| Medium | 1574 | |
| Low | 324 | 16.2% |
| High | 102 | 5.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| i | 1676 | |
| M | 1574 | |
| e | 1574 | |
| d | 1574 | |
| u | 1574 | |
| m | 1574 | |
| L | 324 | 3.0% |
| o | 324 | 3.0% |
| w | 324 | 3.0% |
| H | 102 | 0.9% |
| Other values (2) | 204 | 1.9% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 10824 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| i | 1676 | |
| M | 1574 | |
| e | 1574 | |
| d | 1574 | |
| u | 1574 | |
| m | 1574 | |
| L | 324 | 3.0% |
| o | 324 | 3.0% |
| w | 324 | 3.0% |
| H | 102 | 0.9% |
| Other values (2) | 204 | 1.9% |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 10824 |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 10824 | 100.0% |
Interactions
Correlations
| | Age | Air_Pollution | Alcohol_Use | BMI | BRCA_Mutation | Calcium_Intake | Cancer_Type | Diet_Red_Meat | Diet_Salted_Processed | Family_History | Fruit_Veg_Intake | Gender | H_Pylori_Infection | Obesity | Occupational_Hazards | Overall_Risk_Score | Physical_Activity | Physical_Activity_Level | Risk_Level | Smoking |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Age | 1.000 | -0.024 | -0.031 | -0.011 | 0.000 | 0.053 | 0.219 | -0.053 | -0.069 | 0.040 | 0.014 | 0.291 | 0.036 | -0.006 | 0.027 | -0.045 | 0.057 | -0.046 | 0.031 | 0.030 |
| Air_Pollution | -0.024 | 1.000 | 0.070 | 0.028 | 0.000 | 0.054 | 0.268 | -0.078 | 0.032 | 0.000 | -0.045 | 0.000 | 0.000 | -0.071 | 0.085 | 0.500 | 0.077 | 0.004 | 0.280 | 0.461 |
| Alcohol_Use | -0.031 | 0.070 | 1.000 | -0.013 | 0.034 | -0.063 | 0.000 | -0.023 | -0.027 | 0.000 | 0.040 | 0.024 | 0.059 | 0.007 | 0.001 | 0.386 | 0.037 | 0.018 | 0.223 | 0.114 |
| BMI | -0.011 | 0.028 | -0.013 | 1.000 | 0.082 | 0.022 | 0.000 | 0.036 | -0.010 | 0.000 | -0.012 | 0.000 | 0.000 | -0.003 | 0.002 | 0.034 | -0.004 | -0.004 | 0.047 | -0.006 |
| BRCA_Mutation | 0.000 | 0.000 | 0.034 | 0.082 | 1.000 | 0.056 | 0.148 | 0.000 | 0.000 | 0.000 | 0.019 | 0.072 | 0.000 | 0.039 | 0.000 | 0.027 | 0.018 | 0.000 | 0.000 | 0.020 |
| Calcium_Intake | 0.053 | 0.054 | -0.063 | 0.022 | 0.056 | 1.000 | 0.174 | 0.099 | 0.052 | 0.014 | -0.020 | 0.303 | 0.033 | -0.111 | 0.075 | 0.055 | -0.006 | -0.006 | 0.041 | 0.073 |
| Cancer_Type | 0.219 | 0.268 | 0.000 | 0.000 | 0.148 | 0.174 | 1.000 | 0.301 | 0.155 | 0.000 | 0.134 | 0.612 | 0.047 | 0.221 | 0.206 | 0.141 | 0.000 | 0.066 | 0.163 | 0.361 |
| Diet_Red_Meat | -0.053 | -0.078 | -0.023 | 0.036 | 0.000 | 0.099 | 0.301 | 1.000 | 0.184 | 0.000 | -0.188 | 0.058 | 0.000 | -0.051 | -0.010 | 0.264 | -0.004 | 0.034 | 0.160 | -0.145 |
| Diet_Salted_Processed | -0.069 | 0.032 | -0.027 | -0.010 | 0.000 | 0.052 | 0.155 | 0.184 | 1.000 | 0.000 | -0.223 | 0.000 | 0.144 | -0.037 | 0.058 | 0.352 | -0.023 | -0.004 | 0.204 | -0.059 |
| Family_History | 0.040 | 0.000 | 0.000 | 0.000 | 0.000 | 0.014 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.016 | 0.029 | 0.040 | 0.030 | 0.052 | 0.053 | 0.000 |
| Fruit_Veg_Intake | 0.014 | -0.045 | 0.040 | -0.012 | 0.019 | -0.020 | 0.134 | -0.188 | -0.223 | 0.000 | 1.000 | 0.000 | 0.157 | 0.010 | -0.050 | -0.147 | 0.017 | -0.010 | 0.101 | 0.040 |
| Gender | 0.291 | 0.000 | 0.024 | 0.000 | 0.072 | 0.303 | 0.612 | 0.058 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.216 | 0.048 | 0.058 | 0.079 | 0.082 | 0.000 | 0.133 |
| H_Pylori_Infection | 0.036 | 0.000 | 0.059 | 0.000 | 0.000 | 0.033 | 0.047 | 0.000 | 0.144 | 0.000 | 0.157 | 0.000 | 1.000 | 0.042 | 0.047 | 0.008 | 0.071 | 0.000 | 0.014 | 0.097 |
| Obesity | -0.006 | -0.071 | 0.007 | -0.003 | 0.039 | -0.111 | 0.221 | -0.051 | -0.037 | 0.016 | 0.010 | 0.216 | 0.042 | 1.000 | 0.001 | 0.217 | 0.011 | 0.020 | 0.125 | -0.090 |
| Occupational_Hazards | 0.027 | 0.085 | 0.001 | 0.002 | 0.000 | 0.075 | 0.206 | -0.010 | 0.058 | 0.029 | -0.050 | 0.048 | 0.047 | 0.001 | 1.000 | 0.360 | -0.002 | 0.043 | 0.177 | -0.010 |
| Overall_Risk_Score | -0.045 | 0.500 | 0.386 | 0.034 | 0.027 | 0.055 | 0.141 | 0.264 | 0.352 | 0.040 | -0.147 | 0.058 | 0.008 | 0.217 | 0.360 | 1.000 | 0.048 | 0.046 | 0.818 | 0.434 |
| Physical_Activity | 0.057 | 0.077 | 0.037 | -0.004 | 0.018 | -0.006 | 0.000 | -0.004 | -0.023 | 0.030 | 0.017 | 0.079 | 0.071 | 0.011 | -0.002 | 0.048 | 1.000 | 0.023 | 0.070 | 0.100 |
| Physical_Activity_Level | -0.046 | 0.004 | 0.018 | -0.004 | 0.000 | -0.006 | 0.066 | 0.034 | -0.004 | 0.052 | -0.010 | 0.082 | 0.000 | 0.020 | 0.043 | 0.046 | 0.023 | 1.000 | 0.006 | 0.020 |
| Risk_Level | 0.031 | 0.280 | 0.223 | 0.047 | 0.000 | 0.041 | 0.163 | 0.160 | 0.204 | 0.053 | 0.101 | 0.000 | 0.014 | 0.125 | 0.177 | 0.818 | 0.070 | 0.006 | 1.000 | 0.243 |
| Smoking | 0.030 | 0.461 | 0.114 | -0.006 | 0.020 | 0.073 | 0.361 | -0.145 | -0.059 | 0.000 | 0.040 | 0.133 | 0.097 | -0.090 | -0.010 | 0.434 | 0.100 | 0.020 | 0.243 | 1.000 |
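To read the matrix quickly, it helps to pull out only the strongest pairs. The sketch below does this for a handful of coefficients copied from the table above. The 0.5 cutoff and the helper name `high_pairs` are assumptions made for illustration; the report itself does not state the threshold it uses for its "High correlation" alerts.

```python
# A few coefficients copied from the correlation matrix (not the full table).
pairs = {
    ("Overall_Risk_Score", "Risk_Level"): 0.818,
    ("Cancer_Type", "Gender"): 0.612,
    ("Air_Pollution", "Overall_Risk_Score"): 0.500,
    ("Air_Pollution", "Smoking"): 0.461,
    ("Overall_Risk_Score", "Smoking"): 0.434,
    ("Diet_Salted_Processed", "Fruit_Veg_Intake"): -0.223,
}

THRESHOLD = 0.5  # assumed cutoff, not taken from the report


def high_pairs(pairs, threshold):
    """Return (a, b, r) tuples with |r| >= threshold, strongest first."""
    flagged = [(a, b, r) for (a, b), r in pairs.items() if abs(r) >= threshold]
    return sorted(flagged, key=lambda item: abs(item[2]), reverse=True)


flagged = high_pairs(pairs, THRESHOLD)
for a, b, r in flagged:
    print(f"{a} ~ {b}: {r:+.3f}")
```

With this assumed cutoff, the three pairs that pass are the same ones the report's overview flags as highly correlated. Smoking, at 0.461 against Air_Pollution and 0.434 against Overall_Risk_Score, falls just below it.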
Missing values
Sample
| | Patient_ID | Cancer_Type | Age | Gender | Smoking | Alcohol_Use | Obesity | Family_History | Diet_Red_Meat | Diet_Salted_Processed | Fruit_Veg_Intake | Physical_Activity | Air_Pollution | Occupational_Hazards | BRCA_Mutation | H_Pylori_Infection | Calcium_Intake | Overall_Risk_Score | BMI | Physical_Activity_Level | Risk_Level |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | LU0000 | Breast | 68 | 0 | 7 | 2 | 8 | 0 | 5 | 3 | 7 | 4 | 6 | 3 | 1 | 0 | 0 | 0.398696 | 28.0 | 5 | Medium |
| 1 | LU0001 | Prostate | 74 | 1 | 8 | 9 | 8 | 0 | 0 | 3 | 7 | 1 | 3 | 3 | 0 | 0 | 5 | 0.424299 | 25.4 | 9 | Medium |
| 2 | LU0002 | Skin | 55 | 1 | 7 | 10 | 7 | 0 | 3 | 3 | 4 | 1 | 8 | 10 | 0 | 0 | 6 | 0.605082 | 28.6 | 2 | Medium |
| 3 | LU0003 | Colon | 61 | 0 | 6 | 2 | 2 | 0 | 6 | 2 | 4 | 6 | 4 | 8 | 0 | 0 | 8 | 0.318449 | 32.1 | 7 | Low |
| 4 | LU0004 | Lung | 67 | 1 | 10 | 7 | 4 | 0 | 6 | 3 | 10 | 9 | 10 | 9 | 0 | 0 | 5 | 0.524358 | 25.1 | 2 | Medium |
| 5 | LU0005 | Lung | 77 | 1 | 10 | 8 | 3 | 0 | 6 | 0 | 6 | 2 | 10 | 7 | 0 | 0 | 0 | 0.498668 | 25.1 | 1 | Medium |
| 6 | LU0006 | Lung | 59 | 0 | 10 | 10 | 0 | 0 | 9 | 4 | 0 | 1 | 10 | 9 | 0 | 0 | 5 | 0.662354 | 32.3 | 2 | High |
| 7 | LU0007 | Prostate | 74 | 1 | 8 | 6 | 2 | 1 | 3 | 3 | 2 | 8 | 8 | 7 | 0 | 0 | 1 | 0.479367 | 29.1 | 9 | Medium |
| 8 | LU0008 | Colon | 71 | 1 | 9 | 0 | 3 | 0 | 10 | 4 | 6 | 10 | 8 | 3 | 0 | 0 | 5 | 0.497620 | 24.1 | 5 | Medium |
| 9 | LU0009 | Skin | 55 | 1 | 7 | 1 | 2 | 0 | 0 | 4 | 2 | 5 | 9 | 9 | 0 | 0 | 5 | 0.404837 | 28.2 | 1 | Medium |
| | Patient_ID | Cancer_Type | Age | Gender | Smoking | Alcohol_Use | Obesity | Family_History | Diet_Red_Meat | Diet_Salted_Processed | Fruit_Veg_Intake | Physical_Activity | Air_Pollution | Occupational_Hazards | BRCA_Mutation | H_Pylori_Infection | Calcium_Intake | Overall_Risk_Score | BMI | Physical_Activity_Level | Risk_Level |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1990 | ST0390 | Skin | 52 | 1 | 1 | 0 | 5 | 0 | 6 | 10 | 7 | 3 | 7 | 10 | 0 | 0 | 10 | 0.494717 | 29.2 | 2 | Medium |
| 1991 | ST0391 | Lung | 56 | 0 | 10 | 0 | 1 | 0 | 8 | 7 | 6 | 3 | 10 | 8 | 0 | 0 | 7 | 0.591332 | 29.0 | 6 | Medium |
| 1992 | ST0392 | Colon | 46 | 1 | 1 | 2 | 9 | 0 | 10 | 10 | 0 | 9 | 4 | 5 | 0 | 1 | 6 | 0.483890 | 30.8 | 1 | Medium |
| 1993 | ST0393 | Lung | 63 | 0 | 10 | 3 | 9 | 0 | 0 | 5 | 5 | 4 | 9 | 1 | 0 | 0 | 5 | 0.422999 | 28.8 | 10 | Medium |
| 1994 | ST0394 | Skin | 67 | 1 | 5 | 1 | 6 | 0 | 3 | 0 | 4 | 0 | 5 | 10 | 0 | 0 | 4 | 0.343532 | 27.4 | 5 | Medium |
| 1995 | ST0395 | Colon | 60 | 1 | 4 | 6 | 4 | 0 | 10 | 6 | 4 | 4 | 5 | 3 | 1 | 0 | 4 | 0.437539 | 30.3 | 3 | Medium |
| 1996 | ST0396 | Prostate | 84 | 1 | 5 | 7 | 8 | 0 | 10 | 0 | 1 | 2 | 1 | 3 | 0 | 0 | 2 | 0.451128 | 25.9 | 4 | Medium |
| 1997 | ST0397 | Lung | 65 | 0 | 7 | 2 | 10 | 0 | 4 | 2 | 2 | 3 | 6 | 0 | 0 | 1 | 0 | 0.295760 | 22.5 | 3 | Low |
| 1998 | ST0398 | Lung | 64 | 1 | 10 | 2 | 10 | 0 | 2 | 10 | 7 | 5 | 4 | 2 | 0 | 0 | 10 | 0.422201 | 25.3 | 3 | Medium |
| 1999 | ST0399 | Breast | 64 | 0 | 3 | 4 | 10 | 0 | 0 | 5 | 1 | 0 | 3 | 9 | 0 | 0 | 0 | 0.518137 | 23.0 | 3 | Medium |